Goertzel algorithm

The Goertzel algorithm is a digital signal processing (DSP) technique for identifying frequency components of a signal, published by Gerald Goertzel in 1958[1]. While the general Fast Fourier transform (FFT) algorithm computes evenly across the bandwidth of the incoming signal, the Goertzel algorithm looks at specific, predetermined frequencies.

A practical application of this algorithm is recognition of the DTMF tones produced by the buttons pushed on a telephone keypad[2][3][4].

It can also be used "in reverse" as a sinusoid synthesis function, which requires only 1 multiplication and 1 subtraction per generated sample.[5][6]

Contents

Explanation of algorithm

The Goertzel algorithm computes a sequence, s(n), given an input sequence, x(n):

s(n) = x(n) %2B 2 \cos(2 \pi \omega) s(n-1) - s(n-2)

where s(-2) = s(-1) = 0 and \omega is some frequency of interest, in cycles per sample, which should be less than 1/2. This effectively implements a second-order IIR filter with poles at e^{%2B2 \pi i \omega} and e^{-2 \pi i \omega}, and requires only one multiplication (assuming 2 cos(2 \pi \omega) is pre-computed), one addition and one subtraction per input sample. For real inputs, these operations are real.

The Z transform of this process is

\frac{S(z)}{X(z)} = \frac{1}{1 - 2 \cos(2 \pi \omega) z^{-1} %2B z^{-2}} = \frac{1}{(1 - e^{%2B2 \pi i \omega} z^{-1})(1 - e^{-2 \pi i \omega} z^{-1})}

Applying an additional, FIR, transform of the form

\frac{Y(z)}{S(z)} = 1 - e^{-2 \pi i \omega}z^{-1}

will give an overall transform of

\frac{S(z)}{X(z)} \frac{Y(z)}{S(z)} = \frac{Y(z)}{X(z)} = \frac{(1 - e^{-2 \pi i \omega}z^{-1})}{(1 - e^{%2B2 \pi i \omega} z^{-1})(1 - e^{-2 \pi i \omega} z^{-1})} = \frac{1}{1 - e^{%2B2 \pi i \omega} z^{-1}}

The time-domain equivalent of this overall transform is

y(n) = x(n) %2B e^{%2B2 \pi i \omega} y(n-1) = \sum_{k=-\infty}^{n}x(k) e^{%2B2 \pi i \omega (n-k)} = e^{%2B2 \pi i \omega n} \sum_{k=-\infty}^{n}x(k) e^{-2 \pi i \omega k},

which becomes, assuming x(k) = 0 for all k < 0

y(n) = e^{%2B2 \pi i \omega n} \sum_{k=0}^{n}x(k) e^{-2 \pi i \omega k}

or, the equation for the (n%2B1)-sample DFT of x, evaluated for \omega and multiplied by the scale factor e^{%2B2 \pi i \omega n}.

Note that applying the additional transform Y(z)/S(z) only requires the last two samples of the s sequence. Consequently, upon processing N samples x(0)...x(N-1), the last two samples from the s sequence can be used to compute the value of a DFT bin, which corresponds to the chosen frequency \omega as

X(\omega) = y(N-1) e^{-2 \pi i \omega (N - 1)} = ( s(N-1) - e^{-2 \pi i \omega} s(N-2) ) e^{-2 \pi i \omega (N - 1)}

For the special case often found when computing DFT bins, where \omega N = k for some integer, k, this simplifies to

X(\omega) = ( s(N-1) - e^{-2 \pi i \omega} s(N-2) ) e^{%2B2 \pi i \omega} = e^{%2B2 \pi i \omega} s(N-1) - s(N-2)

In either case, the corresponding power can be computed using the same cosine term required to compute s as

X(\omega) X'(\omega) = s(N-2)^2 %2B s(N-1)^2 - 2 cos(2 \pi \omega) s(N-2) s(N-1)

Implementation

When implemented in a general-purpose processor, values for s(n-1) and s(n-2) can be retained in variables and new values of s can be shifted through as they are computed, assuming that only the final two values of the s sequence are required. The code may then be as follows:

s_prev = 0
s_prev2 = 0
normalized_frequency = target_frequency / sample_rate;
coeff = 2*cos(2*PI*normalized_frequency);
for each sample, x[n],
  s = x[n] + coeff*s_prev - s_prev2;
  s_prev2 = s_prev;
  s_prev = s;
end
power = s_prev2*s_prev2 + s_prev*s_prev - coeff*s_prev*s_prev2;

Instead of storing the history of samples in an array it is also possible to process the incoming samples iteratively in real-time.

Computational complexity

To compute a single DFT bin for a complex sequence of length N, this algorithm requires 2N multiplications and 4N additions/subtractions within the loop, as well as 4 multiplications and 4 additions/subtractions to compute X(\omega), for a total of 2N+4 multiplications and 4N+4 additions/subtractions (for real sequences, the required operations are half that amount). In contrast, the Fast Fourier transform (FFT) requires 2log2N multiplications and 3log2N additions/subtractions per DFT bin, but must compute all N bins simultaneously (similar optimizations are available to halve the number of operations in an FFT when the input sequence is real).

When the number of desired DFT bins, M, is small (e.g., when detecting DTMF tones), it is computationally advantageous to implement the Goertzel algorithm, rather than the FFT. Approximately, this occurs when

M < \frac{5}{6} \log_2 N

or if, for some reason, N is not an integral power of 2 while you stick to a simple FFT algorithm like the 2-radix Cooley-Tukey FFT algorithm, and zero-padding the samples out to an integral power of 2 would violate

M < \frac{5 N_{\mathrm{padded}}}{6 N} \log_2\left(N_{\mathrm{padded}}\right)

Moreover, the Goertzel algorithm can be computed as samples come in, and the FFT algorithm may require a large table of N pre-computed sines and cosines to be efficient.

If multiplications are not considered as difficult as additions, or vice versa, the 5/6 ratio can shift between anything from 3/4 (additions dominate) to 1/1 (multiplications dominate).

References

  1. ^ Goertzel, G. (January 1958), "An Algorithm for the Evaluation of Finite Trigonomentric Series", American Mathematical Monthly 65 (1): 34–35 
  2. ^ Mock (1985)
  3. ^ Chen (1996)
  4. ^ Schmer (2000)
  5. ^ http://www.dattalo.com/technical/theory/sinewave.html
  6. ^ http://cs-www.cs.yale.edu/c2/images/uploads/AudioProc-TR.pdf

External links